SEQUENCE LISTING

(1) GENERAL INFORMATION:



(i) APPLICANT: LaVallie, Edward

Racie, Lisa 



(ii) TITLE OF INVENTION: HUMAN SDF-5 PROTEIN AND COMPOSITIONS 



(iii) NUMBER OF SEQUENCES: 7 



(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: GENETICS INSTITUTE, INC. 

(B) STREET: 87 CAMBRIDGEPARK DRIVE 

(C) CITY: CAMBRIDGE 

(D) STATE: MA 

(E) COUNTRY: USA

(F) ZIP: 02140 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25




(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 08/848,439 

(B) FILING DATE: 08-MAY-1997 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: GYURE, BARBARA A. 

(B) REGISTRATION NUMBER: 34,614 

(C) REFERENCE/DOCKET NUMBER: GI 5288A

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617) 498-8653 

(B) TELEFAX: (617) 876-5851 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2027 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:




GAATTCGGCC TTCATGGCCT AGCTCATTCT GCTCCCCCGG GTCGGAGCCC CCCGGAGCTG 60 

CGCGCGGGCT TGCAGCGCCT CGCCCGCGCT CCTCCCGGTG TCCCGCTTCT CCGCGCCCCA 120

GCCGCCGGCT GCCAGCTTTT CGGGGCCCCG AGTCGCACCC AGCGAAGAGA GCGGGCCCGG 180 

GACAAGCTCG AACTCCGGCC GCCTCGCCCT TCCCCGGCTC CGCTCCCTCT GCCCCCTCGG 240

GGTCGCGCGC CCACGATGCT GCAGGGCCCT GGCTCGCTGC TGCTGCTCTT CCTCGCCTCG 300

CACTGCTGCC TGGGCTCGGC GCGCGGGCTC TTCCTCTTTG GCCAGCCCGA CTTCTCCTAC 360

AAGCGCAGCA ATTGCAAGCC CATCCCGGCC AACCTGCAGC TGTGCCACGG CATCGAATAC 420

CAGAACATGC GGCTGCCCAA CCTGCTGGGC CACGAGACCA TGAAGGAGGT GCTGGAGCAG 480

GCCGGCGCTT GGATCCCGCT GGTCATGAAG CAGTGCCACC CGGACACCAA GAAGTTCCTG 540 

TGCTCGCTCT TCGCCCCCGT CTGCCTCGAT GACCTAGACG AGACCATCCA GCCATGCCAC 600 

TCGCTCTGCG TGCAGGTGAA GGACCGCTGC GCCCCGGTCA TGTCCGCCTT CGGCTTCCCC 660 

TGGCCCGACA TGCTTGAGTG CGACCGTTTC CCCCAGGACA ACGACCTTTG CATCCCCCTC 720

GCTAGCAGCG ACCACCTCCT GCCAGCCACC GAGGAAGCTC CAAAGGTATG TGAAGCCTGC 780

AAAAATAAAA ATGATGATGA CAACGACATA ATGGAAACGC TTTGTAAAAA TGATTTTGCA 840 

CTGAAAATAA AAGTGAAGGA GATAACCTAC ATCAACCGAG ATACCAAAAT CATCCTGGAG 900

ACCAAGAGCA AGACCATTTA CAAGCTGAAC GGTGTGTCCG AAAGGGACCT GAAGAAATCG 960

GTGCTGTGGC TCAAAGACAG CTTGCAGTGC ACCTGTGAGG AGATGAACGA CATCAACGCG 1020 

CCCTATCTGG TCATGGGACA GAAACAGGGT GGGGAGCTGG TGATCACCTC GGTGAAGCGG 1080

TGGCAGAAGG GGCAGAGAGA GTTCAAGCGC ATCTCCCGCA GCATCCGCAA GCTGCAGTGC 1140 

TAGTCCCGGC ATCCTGATGG CTCCGACAGG CCTGCTCCAG AGCACGGCTG ACCATTTCTG 1200

CTCCGGGATC TCAGCTCCCG TTCCCCAAGC ACACTCCTAG CTGCTCCAGT CTCAGCCTGG 1260

GCAGCTTCCC CCTGCCTTTT GCACGTTTGC ATCCCCAGCA TTTCCTGAGT TATAAGGCCA 1320

CAGGAGTGGA TAGCTGTTTT CACCTAAAGG AAAAGCCCAC CCGAATCTTG TAGAAATATT 1380

CAAACTAATA AAATCATGAA TATTTTTATG AAGTTTAAAA ATAGCTCACT TTAAAGCTAG 1440 

TTTTGAATAG GTGCAACTGT GACTTGGGTC TGGTTGGTTG TTGTTTGTTG TTTTGAGTCA 1500

GCTGATTTTC ACTTCCCACT GAGGTTGTCA TAACATGCAA ATTGCTTCAA TTTTCTCTGT 1560

GGCCCAAACT TGTGGGTCAC AAACCCTGTT GAGATAAAGC TGGCTGTTAT CTCAACATCT 1620 

TCATCAGCTC CAGACTGAGA CTCAGTGTCT AAGTCTTACA ACAATTCATC ATTTTATACC 1680 
TTCAATGGGA ACTTAAACTG TTACATGTAT CACATTCCAG CTACAATACT TCCATTTATT 1740

AGAAGCACAT TAACCATTTC TATAGCATGA TTTCTTCAAG TAAAAGGCAA AAGATATAAA 1800 

TTTTATAATT GACTTGAGTA CTTTAAGCCT TGTTTAAAAC ATTTCTTACT TAACTTTTGC 1860 

AAATTAAACC CATTGTAGCT TACCTGTAAT ATACATAGTA GTTTACCTTT AAAAGTTGTA 1920

AAAATATTGC TTTAACCAAC ACTGTAAATA TTTCAGATAA ACATTATATT CTTGTATATA 1980

AACTTTACAT CCTGTTTTAC CTAAAAAAAA AAAAAAAAAG CGGCCGC 2027
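
The number at the end of each row above is a running count of bases, so every row can be checked mechanically. The following Python sketch is an illustration only and is not part of the filed listing. The helper name `parse_listing_rows` is hypothetical, and the code assumes each row is a series of space-separated base groups followed by one integer counter.

```python
def parse_listing_rows(rows):
    """Join sequence rows and verify the running count at the end of each row."""
    pieces = []
    total = 0
    for row in rows:
        # Everything except the last token is sequence; the last token is the counter.
        *groups, counter = row.split()
        bases = "".join(groups).upper()
        if set(bases) - set("ACGT"):
            raise ValueError(f"unexpected character in row: {row!r}")
        total += len(bases)
        if total != int(counter):
            raise ValueError(f"counter {counter} does not match {total} bases")
        pieces.append(bases)
    return "".join(pieces)


# First two rows of SEQ ID NO: 1.
rows = [
    "GAATTCGGCC TTCATGGCCT AGCTCATTCT GCTCCCCCGG GTCGGAGCCC CCCGGAGCTG 60",
    "CGCGCGGGCT TGCAGCGCCT CGCCCGCGCT CCTCCCGGTG TCCCGCTTCT CCGCGCCCCA 120",
]
fragment = parse_listing_rows(rows)
print(len(fragment))
```

A row whose counter disagrees with the accumulated length, or that contains a character other than A, C, G or T, raises `ValueError`, which is a convenient way to locate transcription damage in a long listing.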

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 295 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Leu Gln Gly Pro Gly Ser Leu Leu Leu Leu Phe Leu Ala Ser His
1 5 10 15

Cys Cys Leu Gly Ser Ala Arg Gly Leu Phe Leu Phe Gly Gln Pro Asp
20 25 30

Phe Ser Tyr Lys Arg Ser Asn Cys Lys Pro Ile Pro Ala Asn Leu Gln
35 40 45

Leu Cys His Gly Ile Glu Tyr Gln Asn Met Arg Leu Pro Asn Leu Leu
50 55 60

Gly His Glu Thr Met Lys Glu Val Leu Glu Gln Ala Gly Ala Trp Ile
65 70 75 80

Pro Leu Val Met Lys Gln Cys His Pro Asp Thr Lys Lys Phe Leu Cys
85 90 95

Ser Leu Phe Ala Pro Val Cys Leu Asp Asp Leu Asp Glu Thr Ile Gln
100 105 110

Pro Cys His Ser Leu Cys Val Gln Val Lys Asp Arg Cys Ala Pro Val
115 120 125

Met Ser Ala Phe Gly Phe Pro Trp Pro Asp Met Leu Glu Cys Asp Arg
130 135 140

Phe Pro Gln Asp Asn Asp Leu Cys Ile Pro Leu Ala Ser Ser Asp His
145 150 155 160

Leu Leu Pro Ala Thr Glu Glu Ala Pro Lys Val Cys Glu Ala Cys Lys
165 170 175

Asn Lys Asn Asp Asp Asp Asn Asp Ile Met Glu Thr Leu Cys Lys Asn
180 185 190

Asp Phe Ala Leu Lys Ile Lys Val Lys Glu Ile Thr Tyr Ile Asn Arg
195 200 205

Asp Thr Lys Ile Ile Leu Glu Thr Lys Ser Lys Thr Ile Tyr Lys Leu
210 215 220

Asn Gly Val Ser Glu Arg Asp Leu Lys Lys Ser Val Leu Trp Leu Lys
225 230 235 240

Asp Ser Leu Gln Cys Thr Cys Glu Glu Met Asn Asp Ile Asn Ala Pro
245 250 255

Tyr Leu Val Met Gly Gln Lys Gln Gly Gly Glu Leu Val Ile Thr Ser
260 265 270

Val Lys Arg Trp Gln Lys Gly Gln Arg Glu Phe Lys Arg Ile Ser Arg
275 280 285

Ser Ile Arg Lys Leu Gln Cys
290 295
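
The protein of SEQ ID NO: 2 can be cross-checked against the nucleotide sequence of SEQ ID NO: 1 by translating the open reading frame. The Python sketch below is illustrative only and is not part of the filed listing. It assumes the standard genetic code, and it assumes that the initiating ATG begins at base 256 of SEQ ID NO: 1, an offset obtained by counting bases in the listing rather than one stated in it. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter amino acid codes, as used in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate DNA from its first base, stopping at the first stop codon."""
    dna = dna.upper()
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino = CODON_TABLE[dna[i:i + 3]]
        if amino == "*":
            break
        residues.append(THREE_LETTER[amino])
    return " ".join(residues)


# Bases 256-285 of SEQ ID NO: 1 (assumed start of the coding region).
fragment = "ATGCTGCAGGGCCCTGGCTCGCTGCTGCTG"
print(translate(fragment))  # Met Leu Gln Gly Pro Gly Ser Leu Leu Leu
```

The printed residues agree with the first ten residues of SEQ ID NO: 2. Under the same assumption, 295 codons starting at base 256 end at base 1140, and the next codon (TAG at bases 1141-1143) is a stop codon.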



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 275 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Ser Ala Arg Gly Leu Phe Leu Phe Gly Gln Pro Asp Phe Ser Tyr Lys
1 5 10 15

Arg Ser Asn Cys Lys Pro Ile Pro Ala Asn Leu Gln Leu Cys His Gly
20 25 30

Ile Glu Tyr Gln Asn Met Arg Leu Pro Asn Leu Leu Gly His Glu Thr
35 40 45

Met Lys Glu Val Leu Glu Gln Ala Gly Ala Trp Ile Pro Leu Val Met
50 55 60

Lys Gln Cys His Pro Asp Thr Lys Lys Phe Leu Cys Ser Leu Phe Ala
65 70 75 80

Pro Val Cys Leu Asp Asp Leu Asp Glu Thr Ile Gln Pro Cys His Ser
85 90 95

Leu Cys Val Gln Val Lys Asp Arg Cys Ala Pro Val Met Ser Ala Phe
100 105 110

Gly Phe Pro Trp Pro Asp Met Leu Glu Cys Asp Arg Phe Pro Gln Asp
115 120 125

Asn Asp Leu Cys Ile Pro Leu Ala Ser Ser Asp His Leu Leu Pro Ala
130 135 140

Thr Glu Glu Ala Pro Lys Val Cys Glu Ala Cys Lys Asn Lys Asn Asp
145 150 155 160

Asp Asp Asn Asp Ile Met Glu Thr Leu Cys Lys Asn Asp Phe Ala Leu
165 170 175

Lys Ile Lys Val Lys Glu Ile Thr Tyr Ile Asn Arg Asp Thr Lys Ile
180 185 190

Ile Leu Glu Thr Lys Ser Lys Thr Ile Tyr Lys Leu Asn Gly Val Ser
195 200 205

Glu Arg Asp Leu Lys Lys Ser Val Leu Trp Leu Lys Asp Ser Leu Gln
210 215 220

Cys Thr Cys Glu Glu Met Asn Asp Ile Asn Ala Pro Tyr Leu Val Met
225 230 235 240

Gly Gln Lys Gln Gly Gly Glu Leu Val Ile Thr Ser Val Lys Arg Trp
245 250 255

Gln Lys Gly Gln Arg Glu Phe Lys Arg Ile Ser Arg Ser Ile Arg Lys
260 265 270

Leu Gln Cys
275
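
SEQ ID NO: 3 (275 residues) reads as SEQ ID NO: 2 (295 residues) with its first 20 residues removed. The Python sketch below is illustrative only and is not part of the filed listing. It locates the start of SEQ ID NO: 3 within the opening residues of SEQ ID NO: 2; the helper name `find_offset` is hypothetical, and only the leading residues of each sequence are included to keep the example short.

```python
def find_offset(full, part):
    """Return the 1-based position in `full` where `part` begins, or None."""
    n = len(part)
    for i in range(len(full) - n + 1):
        if full[i:i + n] == part:
            return i + 1
    return None


# Residues 1-32 of SEQ ID NO: 2 and residues 1-12 of SEQ ID NO: 3.
seq2_start = (
    "Met Leu Gln Gly Pro Gly Ser Leu Leu Leu Leu Phe Leu Ala Ser His "
    "Cys Cys Leu Gly Ser Ala Arg Gly Leu Phe Leu Phe Gly Gln Pro Asp"
).split()
seq3_start = "Ser Ala Arg Gly Leu Phe Leu Phe Gly Gln Pro Asp".split()

offset = find_offset(seq2_start, seq3_start)
print(offset)  # 21
```

An offset of 21 is consistent with the stated lengths, since residues 21 through 295 of SEQ ID NO: 2 number 275.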



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
CATGGGCAGC TCGAG 



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CTGCAGGCGA GCCTGAATTC CTCGAGCCAT CATG 



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 68 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CGAGGTTAAA AAACGTCTAG GCCCCCCGAA CCACGGGGAC GTGGTTTTCC TTTGAAAAAC 
ACGATTGC 



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 




(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



ATCGATGCCG TGGCACAGCT GCAGGTTG 